Hashtables are a data structure that utilize key value pairs. This means every Node or Bucket has both a key, and a value.
The basic idea of a hashtable is the ability to store the key into this data structure, and quickly retrieve the value.
A hash is the ability to encode the key that will eventually map to a specific location in the data structure that we can look at directly to retrieve the value.
Example:
Key = "Cat"
Value = "Josie"
67 + 97 + 116 = 280
280 * 599 = 69648
69648 % 1024 = 16
Key gets placed in index of 16.
Collisions
A collision occurs when more than one key hashes to the same index in an array. As mentioned earlier, a “perfect hash” will never have any collisions.
Collisions are solved by changing the initial state of the buckets. Instead of starting them all as null we can initialize a LinkedList in each one
Hash Maps can have any number of buckets. If a hash map has only a few buckets it will be densely full and have many collisions.
It’s possible to compute the “load factor” of a hash table. The load factor tells us something about how full the hash table is.
Add()
Find()
The Find takes in a key, gets the Hash, and goes to the index location specified. Once at the index location is found in the array, it is then the responsibility of the algorithm the iterate through the bucket and see if the key exists and return the value.
contains()
The Contains method will accept a key, and return a bool on if that key exists inside the hashtable.
GetHash()
The GetHash will accept a key as a string, conduct the hash, and then return the index of the array where the key/value should be placed.
To achieve a good hashing mechanism, It is important to have a good hash function with the following basic requirements:
Easy to compute: It should be easy to compute and must not become an algorithm in itself.
Uniform distribution: It should provide a uniform distribution across the hash table and should not result in clustering.
Less collisions: Collisions occur when pairs of elements are mapped to the same hash value. These should be avoided.
class HashTable:
def __init__(self):
self.capacity = INITIAL_CAPACITY
self.size = 0
self.buckets = [None] * self.capacity
Node class
class Node:
def __init__(self, key, value):
self.key = key
self.value = value
self.next = None
Find method
def find(self, key):
index = self.hash(key)
node = self.buckets[index]
while node is not None and node.key != key:
node = node.next
if node is None:
return None
else:
return node.value